Generating and Applying Rules for Web Documents Retrieval
Authors
Abstract
Web document retrieval is challenging because of the huge number of documents available and the difficulty of interpreting them. Both the effectiveness and the efficiency of retrieval are important. This paper presents several soft computing approaches for improving the effectiveness of web document retrieval. These approaches give a more accurate and reasonable representation of the terms provided by the user, show how to match terms and documents with fuzzy logic techniques, and show how to reduce the number of matched documents and necessary terms. A possible architecture for a neuro-fuzzy system that matches terms and documents is sketched. The paper also discusses how to form linguistic rules by polling a panel of experts.
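The abstract does not spell out the matching procedure, so the following is only a minimal sketch of one common way to match weighted query terms against documents with fuzzy logic and then reduce the matched set with a threshold. Everything in it is an assumption made for illustration, not the paper's method: the names match_degree and retrieve, the weighting rule min over terms of max(1 - weight, membership), the alpha cutoff, and the sample data.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: each document maps a term to a membership
# degree in [0, 1]; the query maps each term to an importance weight in [0, 1].
FuzzySet = Dict[str, float]


def match_degree(query: FuzzySet, doc: FuzzySet) -> float:
    """Fuzzy AND over the query terms (assumed rule, not taken from the paper).

    Each term contributes max(1 - weight, membership), so a low-importance
    term cannot pull the score far down; the min acts as the fuzzy AND.
    """
    if not query:
        return 0.0
    return min(max(1.0 - w, doc.get(term, 0.0)) for term, w in query.items())


def retrieve(query: FuzzySet, docs: Dict[str, FuzzySet],
             alpha: float = 0.5) -> List[Tuple[str, float]]:
    """Keep only documents whose match degree reaches alpha, best first."""
    scored = {name: match_degree(query, d) for name, d in docs.items()}
    kept = [(name, s) for name, s in scored.items() if s >= alpha]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query = {"fuzzy": 1.0, "retrieval": 0.6}
    docs = {
        "d1": {"fuzzy": 0.9, "retrieval": 0.8},
        "d2": {"fuzzy": 0.7},
        "d3": {"retrieval": 0.9},
    }
    print(retrieve(query, docs, alpha=0.5))
```

Raising alpha shrinks the set of matched documents, which is one simple way to read the abstract's goal of reducing the number of matches; the paper itself may use a different mechanism.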
Similar resources
Reverse Engineering for Web Data: From Visual to Semantic Structure
Despite the advancement of XML, the majority of documents on the Web are still marked up with HTML for visual rendering purposes only, thus building up a huge amount of "legacy" data. To make querying Web-based data more efficient and effective than plain keyword-based retrieval, it is necessary to enrich such Web documents with both structure and semantics. This paper describes...
Mining Technique Using Association Rules Extraction
... automatically extracting association rules from collections of textual documents. The technique is called Extracting Association Rules from Text (EART). It relies on keyword features to discover association rules among the keywords labeling the documents. In this work, the EART system ignores the order in which the words occur and instead focuses on the words and their statistical distributions...
Using NLP techniques to create legal ontologies in a logic programming based web information retrieval system
Web legal information retrieval systems need the capability to reason with the knowledge modelled by legal ontologies. Using this knowledge it is possible to represent and to make inferences about the semantic content of legal documents. In this paper a methodology for applying NLP techniques to automatically create a legal ontology is proposed. The ontology is defined in the OWL semantic web l...
A Text Mining Technique Using Association Rules Extraction
This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique is called Extracting Association Rules from Text (EART). It relies on keyword features to discover association rules among the keywords labeling the documents. In this work, the EART system ignores the order in which the words occur and instead focuses o...